L1-based compression of random forest models
Authors
Abstract
High-dimensional supervised learning problems, e.g. in image exploitation and bioinformatics, are more frequent than ever. Tree-based ensemble methods, such as random forests (Breiman, 2001) and extremely randomized trees (Geurts et al., 2006), are effective variance reduction techniques offering in this context a good trade-off between accuracy, computational complexity, and interpretability.
Similar resources
Comparison of Ordinal Response Modeling Methods like Decision Trees, Ordinal Forest and L1 Penalized Continuation Ratio Regression in High Dimensional Data
Background: Response variables in most medical and health-related research have an ordinal nature. Conventional modeling methods assume predictor variables to be independent, and consider a large number of samples (n) compared to the number of covariates (p). Therefore, it is not possible to use conventional models for high dimensional genetic data in which p > n. The present study compared th...
Comparison of Tourism Placement and Development Models from Land Use Planning perspective in Zagros Forests Case Study: Javanrud County
In recent years, travel and tourism have increased for numerous reasons, and managers are also paying attention to the problems this activity causes. Using tourist presence points in Javanrud County together with analytic hierarchy process (AHP) and random forest (RF) models, the conditions for tourist placement were investigated from a land use planning perspective. In...
Improvement of Support Vector Machine and Random Forest Algorithm in Predicting Khorramabad River Flow Using Non-uniform De-Noising of data and Simplex Algorithm
In this study, to simulate the monthly flow of the Khorramabad River, the river's time series for the period 1955-2014 was decomposed into three levels using the Daubechies-3 wavelet. Based on this, it was found that the signal contains non-uniform noise spanning two time periods, with a boundary at October 2008, which required that the signal become non-un...
Prognosis of multiple sclerosis disease using data mining approaches random forest and support vector machine based on genetic algorithm
Background: Multiple sclerosis (MS) is a degenerative inflammatory disease which is most commonly diagnosed by magnetic resonance imaging (MRI). However, because the MRI device uses a magnetic field, metal objects in the patient's body can endanger the patient's health, disrupt the functioning of the MRI, and distort the images. Due to the limitations of using an MRI device, screening ...
Comparison of Random Survival Forests for Competing Risks and Regression Models in Determining Mortality Risk Factors in Breast Cancer Patients in Mahdieh Center, Hamedan, Iran
Introduction: Breast cancer is one of the most common cancers among women worldwide. Patients with cancer may die due to disease progression or other types of events. These different event types are called competing risks. This study aimed to determine the factors affecting the survival of patients with breast cancer using three different approaches: cause-specific hazards regression, subdistri...